THE UNIVERSITY OF ALBERTA 


THE EFFECT OF DIFFERENT REINFORCEMENT 


SCHEDULES ON A LEARNED REVERSAL HABIT 


by 


Lorne Thomas Yeudall 


A THESIS 
SUBMITTED TO THE FACULTY OF GRADUATE STUDIES 
IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE 
OF MASTER OF SCIENCE 
DEPARTMENT OF PSYCHOLOGY 
EDMONTON, ALBERTA 


August, 1964. 


Digitized by the Internet Archive 
in 2019 with funding from 
University of Alberta Libraries 


https://archive.org/details/Yeudall1964 0 


UNIVERSITY OF ALBERTA 


FACULTY OF GRADUATE STUDIES 


The undersigned certify that they have read, and recommend
to the Faculty of Graduate Studies for acceptance, a thesis
entitled "The Effect of Different Reinforcement Schedules on
a Learned Reversal Habit", submitted by Lorne Thomas Yeudall in
partial fulfilment of the requirements for the degree of Master
of Science.


Abstract 


The purpose of this study was to determine the effects of 
different probability schedules of reinforcement over a series 
of successive reversals in a two-choice learning situation. 
Five equal probability schedules of reinforcement were employed: 
100:00, 90:10, 80:20, 70:30, and 60:40. The main dependent 
variable was the number of responses to the more frequently 
reinforced bar. The experimental Ss were rats of the Sprague-Dawley
strain.


The results showed that different probability schedules of 
reinforcement in successive probability reversals had differential 
effects on the terminal response levels. In regard to mean response 
level the groups divided their responses into proportions which 
tended to equal or match the probability of the scheduled events. 
Individual Ss did not reflect the mean terminal response levels of
their respective groups, i.e., they did not match their respective
probability schedules of reinforcement.


The general shapes of the curves were not congruent with 
statistical learning theory. They were discrepant in that the 
initial response probabilities were extremely divergent from 
their predicted ones, and in that they did not reflect the usual 
negatively accelerated growth characteristics. An error analysis
of the response patterns suggested that a pre-experimentally
induced strategy, set or mode of responding may have been operating




throughout the entire reversal series. The findings of the 
error analysis along with the aforementioned findings were 
discussed in regard to the inadequacy of statistical learning 


theory to explain the results of this study. 




Acknowledgements 


The author wishes to express special thanks to Dr. W. Runquist 
who encouraged and supported me through many analyses of my data. 
I also wish to extend my sincere thanks to Darlene Dardick, 
Elizabeth McLellan, Rosemary Robertson, and Pam Fuller for their 
"typing" hours. I am also very grateful to my other friends whose 
comments and discussions were of great value to the completion of 


this thesis. 


Lorne Thomas Yeudall. 


TABLE OF CONTENTS 


Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
INTRODUCTION
Probability Learning
Reversal Learning
METHOD
Probability-Reversal Training
RESULTS
Reversal Performance
Error Analysis
DISCUSSION
SUMMARY AND CONCLUSIONS
REFERENCES


LIST OF TABLES 


TABLE 1

Means of individual Ss over last 8 reversals
and t of differences from predicted mean

TABLE 2

Summary of the analysis of the trends of
A1 responses

TABLE 3

Duncan's New Multiple Range Test on the last ...

TABLE 4

Summary of the Frequency Plot of individual
Ss' A1 responses over the last 10 reversals

TABLE 5

Mean consecutive unrewarded responses before
shifting to A ...

TABLE 6

Mean percent number of times S used a W-S,
L-S strategy in blocks of ten trials

TABLE 7

Mean percent of times the Ss manifested a
W-S, L-S strategy to E1 events in blocks
of ten trials over reversals 1 to 5, 13 to 17,
and 26 to 30

TABLE 8

Mean percent of times the Ss manifested a
W-S, L-S strategy to E2 events in blocks of
ten trials over reversals 1 to 5, 13 to 17,
and 26 to 30


TABLE 9

Mean number of shifts averaged over all
reversals

TABLE 10

Mean cumulative latency within sessions
averaged over all reversals

TABLE 11

Trend analysis of latency scores




LIST OF FIGURES 


FIGURE 1

Mean percentage of A1 responses over all trials ...

FIGURE 2

Mean probabilities within days averaged
over reversals 1 to 5

FIGURE 3

Mean probabilities within days averaged
over reversals 13 to 17

FIGURE 4

Mean probabilities within days averaged over
reversals 26 to 30

FIGURE 5

Mean predicted and observed Po values
for reversals 1 to 5

FIGURE 6

Mean predicted and observed Po values for
reversals 13 to 17

FIGURE 7

Mean predicted and observed Po values for
reversals 26 to 30

FIGURE 8

The 8 possible response alternatives to E1
and E2 events




FIGURE 9

Mean per cent of times Ss manifest a
W-S, L-S strategy to E1 events averaged
over all reversals

FIGURE 10

Mean per cent of times Ss manifest a W-S,
L-S strategy to E2 events averaged over
all reversals

FIGURE 11

Mean cumulative latencies within sessions
averaged over all reversals




Introduction 


The purpose of this study was to delineate the effects 
of different probability schedules of reinforcement over a series 
of successive reversals in a two-choice learning situation. 
The present study was prompted by some of the theoretical
implications of Estes' statistical learning theory for behavior
in a two-choice learning situation. More specifically, it is
concerned with the extension of statistical learning theory into
a more complex situation. The empirical basis for this study is
derived from two related areas: probability learning and reversal
discrimination. The relevant background information from these
areas, based upon research with humans and animals, is discussed
separately and integrated later.


Probability Learning 


In statistical learning theory, behavior is seen as an 
essentially probabilistic phenomenon. The primary behavioral 
measure is taken to be the probability of occurrence of a member 
of some response class. The types of apparatus used in human and 
animal experiments have usually differed from experiment to 
experiment but have been designed to permit the S to make one of 
two possible choices or predictions on any given trial. Depending 
upon the procedure being used, the S, after making his choice, may 
or may not receive feedback as to whether or not his choice was 
correct or incorrect on that particular trial. In the "noncontingent" 
procedure, the information given to the S is independent of the 


choice or prediction he makes. In other words, regardless of 




what choice or prediction was made on a particular trial, the S
is informed what choice or prediction was correct on that trial.
In the "contingent" procedure, the information received by the S
is dependent upon the choice he makes on any given trial. If he
predicts the event that was programmed as the correct event, he
is informed that he has predicted correctly; but if he predicts
the other event, he is simply informed that his prediction was
wrong. The events to be predicted are usually designated as E1
and E2, with E1 as the higher probability event and E2 as the
lower probability event. The two possible choices that can be
made to E1 and E2 are designated as A1 and A2 respectively.


Since 1950 there has been a strong interest and intensive 
development by various researchers in the area of "probabilistic" 
learning. The first research on this problem was a study by 
Brunswick (1939) using the white rat as an experimental subject. 
Brunswick, employing a standard one unit elevated T-maze, studied 
the rat's behavior in a two-choice left-right situation under three 
conditions of equal probability schedules, 100:00, 75:25 and 67:33,
in which the animals were allowed to correct for errors. He found
that the 100:00 and 75:25 groups' final asymptotes tended to equal
the probability of being rewarded for the left and right choices, but
found no evidence of learning in the 67:33 group. Humphreys (1939),
using humans as Ss, asked them to predict whether a light would
or would not appear after a signal light was turned on. This
experimental design was meant to be analogous to his research on 


conditioning of eyelid responses to a light followed by a puff of 




air, in that the signal light represented the conditioned stimulus
and the light to be predicted represented the unconditioned stimulus.
Thus, the Ss' guesses (anticipations of the second light) were defined
as conditioned responses. When he tested the Ss under two reward
series, 100:00 and 50:50, Humphreys found that Ss divided their
predictions into proportions which tended to equal or match the
proportion or probability of the scheduled sequence of events, i.e.,
100:00 and 50:50.


Interest in probability learning has increased greatly since 
the original studies of Brunswick (1939) and Humphreys (1939), 
resulting in many empirical investigations, and in the emergence 
of various formal mathematical models designed to explain the 
results of these studies. Estes (1950) added impetus by trying 
to formalize some of the basic ideas of learning theory. Bush 
and Mosteller (1951) started to investigate the use of linear
operators in analyzing learning data, and since then various other
models for behavior (mostly with humans) in two-choice and multiple
choice probability learning situations have appeared (Anderson
and Hovland, 1957; Davidson, Suppes and Siegel, 1957; Edwards, 1954;
Siegel and Goldstein, 1959). This study is primarily concerned
with Estes' model.


In the Estes' model, the stimulus population is the central 
concept. The stimulus population is supposed to consist of a 
set of elements from which the S samples stimuli on each trial, 


and in turn these stimuli are connected in an "all or none" 




fashion to the response just made. Each element is conditioned 

to one and only one response. This theoretical formulation of 
Estes has come to be known as "stimulus sampling theory". The 
basic notion of stimulus sampling theory is the conceptualization 
of the totality of stimulus conditions that may be effective during 
the course of learning. Any element in the stimulus population 
has equal probability of being sampled by the S on any given trial 
and these trial samples are drawn randomly from the population, 


with all samples of a given size having equal probabilities. 


Estes' model assumes that only some of the stimuli are
conditioned by the S on any given trial, and in turn these
sampled stimuli are connected or conditioned to responses made
to E1 or E2 events. Thus the probability of a response occurring
on any given trial is equal to the number of sampled stimulus
elements that have been conditioned to a particular response,
divided by the total number of elements sampled. For example,
if the experimental design were a two-choice situation in which
E1 = .75 and E2 = .25, and the Ss were run for a thousand trials,
Estes' model would predict that the final asymptotic response level
would tend towards .75 and .25 for A1 and A2 respectively. This
type of behavior in the two-choice learning situation has been
labelled as "event or probability matching", that is to say, the
subjects will make 75% of their responses to one choice and 25%
to the other one.
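
The matching prediction described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not a procedure taken from the thesis: the population size of 100 elements, the sampling proportion of .2, the random starting state, the rule that every sampled element becomes conditioned to the response for the event that occurred, and the names `simulate_stimulus_sampling` and `asymptotic_a1_rate` are all hypothetical choices made for illustration.

```python
import random


def simulate_stimulus_sampling(pi=0.75, theta=0.2, n_elements=100,
                               n_trials=5000, seed=0):
    """Simulate a noncontingent two-choice task under a stimulus sampling sketch.

    pi    -- assumed probability that the first event (E1) occurs on a trial
    theta -- assumed probability that any one element is sampled on a trial
    Returns a list of booleans, True where the simulated S chose A1.
    """
    rng = random.Random(seed)
    # True: element currently conditioned to A1; False: conditioned to A2.
    conditioned_to_a1 = [rng.random() < 0.5 for _ in range(n_elements)]
    responses = []
    for _ in range(n_trials):
        # Each element enters the trial sample independently.
        sample = [i for i in range(n_elements) if rng.random() < theta]
        # Fall back to the whole population if the sample happens to be empty.
        pool = sample if sample else range(n_elements)
        # Response probability = conditioned elements / elements sampled.
        p_a1 = sum(conditioned_to_a1[i] for i in pool) / len(pool)
        responses.append(rng.random() < p_a1)
        # The event is scheduled independently of the response (noncontingent).
        e1_occurred = rng.random() < pi
        # Assumption: sampled elements become conditioned to the reinforced response.
        for i in sample:
            conditioned_to_a1[i] = e1_occurred
    return responses


def asymptotic_a1_rate(responses, last=4000):
    """Proportion of A1 choices over the final `last` trials."""
    tail = responses[-last:]
    return sum(tail) / len(tail)
```

Under these assumptions, calling `asymptotic_a1_rate(simulate_stimulus_sampling())` should give a value close to .75, i.e., the simulated S distributes its choices roughly in proportion to the event probabilities rather than always choosing A1.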


Learning in Estes' model can be depicted as a process of 


random sampling of the stimulus components which become connected 




to responses in accordance with the probabilities of E1 and E2
events. The rate of learning, θ, for any one S is equal to the
proportion of stimuli sampled on each trial. From this it follows
that the amount of learning, or rather the increase in probability
of a response per trial, is a constant fraction of the amount
remaining to be learned. Estes (1954) purports that, for any one
S, θ is a constant which does not change throughout an experiment.
Due to the abstract nature of θ it can only be determined post hoc.
Once θ is determined from one experiment, however, it can be used
a priori in similar stimulus situations. The prediction of a S's
performance in regard to both the shape of the learning curve and
the final asymptotic level of responding is determinable from θ
for any probability reinforcement schedule.
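
The statement that each trial adds a constant fraction of the amount remaining to be learned can be made concrete with a short sketch. It assumes the expected-value form of the learning rule for the noncontingent case, in which the learning-rate parameter (θ in Estes' notation, `theta` below) multiplies the distance between the current response probability and the reinforcement probability. The symbols `pi` and `p0` and both function names are illustrative assumptions, not notation from the thesis.

```python
def expected_curve(theta, pi, p0, n_trials):
    """Expected probability of the A1 response on trials 0..n_trials.

    Each trial closes a constant fraction `theta` of the remaining
    distance between the current probability and the asymptote `pi`.
    """
    probs = [p0]
    for _ in range(n_trials):
        p = probs[-1]
        probs.append(p + theta * (pi - p))
    return probs


def closed_form(theta, pi, p0, n):
    """Same curve in closed form: pi - (pi - p0) * (1 - theta) ** n."""
    return pi - (pi - p0) * (1 - theta) ** n
```

With, say, `theta=0.1`, `pi=0.75` and `p0=0.5`, the curve rises quickly at first and then levels off at .75. This is the negatively accelerated growth curve the theory predicts, and its asymptote is the matching value.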


The empirical findings of probability learning investigations 
in two-choice situations have both supported and conflicted with 


Estes’ probability matching phenomena. 


A number of experiments (Detambel, 1955; Estes and Burke, 
1955; Estes, Burke, Atkinson and Frankman, 1957; Estes and Straughn, 
1954; Gardner, 1957; Grant, Hake, and Hornseth, 1951; Humphreys, 
1939; Jarvik, 1951; Morse and Runquist, 1960; and Neimark and 
Shuford, 1959) using humans in two-choice noncontingent probability 
situations have found that Ss tend towards an asymptotic response 
level equal to the probability of reinforcement designated by the 


experimental design. 




However, other studies (Bush and Mosteller, 1955; Detambel, 
1955; Edwards, 1956; Goodnow, 1951; Neimark, 1956; and Siegel, 
1959) involving a contingent procedure have demonstrated conflicting 
results. They found that Ss tended to "maximize" the choice 


associated with E1 events.


The above studies suggest that the conflicting results 
(matching vs. maximizing) seem to be a function of contingent 
vs. noncontingent procedures. This conclusion is far from
positive, as Gardner (1957) found no significant differences 


between contingent and noncontingent conditions. 


Another approach to decision-making situations was formalized 
by von Neumann and Morgenstern in 1947. Their theoretical game 
model predicts that a person will learn to maximize the expected 
frequency of correct predictions. Game theorists view probability- 
matching as an "irrational" strategy and believe that Ss should 
eventually settle on the "rational" strategy of choosing the more 


frequently reinforced event 100% of the time. 
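
A short calculation shows why the game-theoretic view treats matching as suboptimal. It is a sketch that assumes the event on each trial is independent of the S's choice; the function name `expected_correct` and its parameters are hypothetical.

```python
def expected_correct(p_choose_a1, pi):
    """Expected proportion of correct predictions.

    p_choose_a1 -- probability that the S predicts the more frequent event
    pi          -- probability that the more frequent event actually occurs
    Assumes choices and events are independent from trial to trial.
    """
    return p_choose_a1 * pi + (1 - p_choose_a1) * (1 - pi)


# Probability matching on a 75:25 schedule.
matching = expected_correct(0.75, 0.75)    # 0.625
# Always choosing the more frequently reinforced event.
maximizing = expected_correct(1.0, 0.75)   # 0.75
```

On a 75:25 schedule, matching is therefore expected to be correct on 62.5% of trials, whereas always choosing the more frequent event is correct on 75%.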


From decision-making theory, Davidson, Suppes, and Siegel
(1957) and Edwards (1954) have presented a hypothesis of
"maximization of expected utility" which seems to account for both types
of predictions, probability-matching and maximization of the more
frequent event.


Edwards (1956), using human Ss in a two-choice learning situation, 


demonstrated that the asymptotic probability levels were higher than the 




probability of reinforcement when both the probabilities and the 
amount of reward were varied. He concluded that both probability 
and amount of reward are important in determining choice behavior 
and that it is possible to compensate for changes in probability 


of reward by reverse changes in the amount of reward. 


Siegel and Goldstein (1959) tested Ss in a two-choice situation
where they systematically varied the utility of the correct event
as follows: (a) a "no pay off" condition in which the only reward
involved was feedback in regard to whether or not the subject's
prediction was correct or incorrect; (b) a "reward" condition in
which a monetary profit could be attained by a correct prediction
of the events; and (c) a "risk" condition in which a monetary gain
was involved with a correct prediction but also a monetary "loss"
with an incorrect prediction. Their results showed that the
probability of predicting the more frequent event would tend
towards unity as the rewards (positive utility) and costs or risks
(negative utility) of correct and incorrect predictions were
increased. Thus the utility of the situation is defined as the
"subjective value" the individual places on the outcome of the
task and has resulted in the general hypothesis that the S will
maximize expected utility in any case.


Conflicting results have also been found where animals were 
used as experimental Ss. Brunswick (1939) using the rat in an 
elevated T-maze apparatus with a noncontingent procedure, i.e., 


the rats were allowed to correct for an error, found that the Ss 




tended to match probabilities under various reinforcement schedules.
In the light of present-day experiments, which usually run the S
for 500-1000 or more trials, Brunswick's data are probably somewhat
inadequate as they represent only five days of training. Lauer and
Estes (1954) trained rats in a T-maze using a noncontingent procedure
with a reinforcement schedule of .75 and .25 randomized with respect
to both trials and daily blocks. They found that the asymptotic
response levels were distributed around mean values of .75 and
.25. Uhl (1963), using both contingent (i.e., rats were not
allowed to correct after an error) and noncontingent procedures
in a two-choice bar pressing situation, trained rats for 1,000
trials with probabilities of .90, .80, .70, and .60 under four
reinforcement conditions of sucrose concentration: 6, 12, 24, and
48 per cent. Rats under the contingent design were more efficient,
i.e., the Ss' final asymptotic response levels to E1 were higher
than those in the noncontingent. In both cases the Ss tended to
over-match their respective E1 probabilities, with the noncontingent
group the closest to demonstrating probability matching behavior.
When the four conditions of sucrose reinforcement were compared
over the last six days, no significant differences were found
between the asymptotic levels of the four probability groups.


Uhl's study contradicts the utility models of both Edwards 
(1956) and Siegel and Goldstein (1959) which were previously 
considered as explanations for the conflict between probability 
matching and maximization. Uhl concluded that approximately 1,000 


trials are necessary to evaluate the effect of different probabilities 


on behavior. 




Parducci and Polt (1958) trained rats on a single unit 
elevated T-maze under both contingent and noncontingent procedures 
with E1 = .85. They found that the contingent group maximized to
the more favorable side whereas the noncontingent group manifested 


probability matching behavior. 


Stanley (1950) using equal probabilities of 100:00, 50:50, 
and 75:25 in a contingent situation found that rats in a T-maze 
tended to approach an asymptotic response level of 1.00 (.917, 
.958, and .917 respectively). The low value of .917 for the 1.00 
group was a product of the experimental design used by Stanley in 
which he controlled the number of rewards instead of the total 
number of trials. He also used a matched-litter technique and, 
as a result, the fast learners in the 1.00 group reached the set 
criterion within a few days and were dropped from the experiment, 
leaving the slow learners as strong contributors to the final 
asymptotic level. Bitterman, Wodinsky & Candland (1958) also found 
that rats maximized the higher probability choice when run under 


a contingent procedure.


It is clear from the aforementioned studies on probability 
learning that Estes' prediction of probability matching has both 
confirmatory and contradictory evidence. The common element 
to all these studies is the simplicity of the learning task 
employed, while their differences concern parameters of
reinforcement and acquisition procedures.




Reversal Learning 


Most of the studies on repeated reversal learning have
indicated that Ss manifest a systematic decrease in errors from
the first to the last reversal. All of the first studies
(Krechevsky, 1938; Buytendijk, 1930; Williams, 1942) demonstrated
this general trend, but differences were noted in the learning
rate and final outcome of the reversal performance. Many
inconsistencies that are evident in these earlier studies can
probably be explained by the use of different apparatus and
procedures, and by the different theoretical interests of the
writers.


Current research has tended to focus on factors that influence
reversal learning, such as massed vs. spaced trials (North, 1950a
and 1950b), number of trials per reversal (North, 1950a and 1950b),
correction vs. noncorrection (North, 1950a and 1950b), trial vs.
performance criteria (Dufort, Guttman, and Kimble, 1954; North,
1950a and 1950b; and Stretch, 1963) and effect of overlearning
(Brookshire, Warren and Ball, 1961; Capaldi and Stevenson, 1957;
Mackintosh, 1962; North, 1950a; North and Clayton, 1959; Pubols,
1956; and Reid, 1953). Although there is some ambiguity as to the
theoretical explanations regarding the effects of these factors on
the final outcome of reversal performance, the general trend of
decreasing errors over a series of reversals is a consistent
finding.




The Problem 


In the studies achieving probability matching, the task in
both humans and lower animals has been a relatively simple
two-choice learning situation, as a left or right response is
consistently associated with E1 events throughout the experiment.


The reversal learning problem was chosen because of its
similarity to two-choice probability situations and because it
is regarded as a more "complex" problem for the rat than a simple
left-right choice discrimination problem (Koronakos & Arnold,
1957).


The reversal problem provides "complexity" in that the E1
events are not consistently associated with either a left or
right choice throughout the entire experiment. On day one of
the experiment E1 events, the higher of the two probabilities,
are contingent upon "left" choices, but on day two they are
contingent upon "right" choices. It has been demonstrated by
various investigators that the S's probability of predicting a
given event, over a series of trials in which two alternative
reinforcing events occur with fixed probabilities, tends to approach
(match) the actual probability of the event. The experiment was
designed to determine whether "probability matching" as predicted
by Estes in a simple two-choice learning situation would occur in
a more complex two-choice learning situation.


Method 


Design 


Thirty rats were given reversal training in a two-bar modified
Skinner Box until they reached a set criterion of reversal performance.
Following pretraining they were randomly divided into five probability
groups which differed in terms of the probability of reinforcement
(.60, .70, .80, .90, and 1.00) for the correct bar on
all later probability-reversal training. Probability of reinforcement
(P) refers to the more frequently reinforced response. The probability
of the less frequently reinforced response was 1-P.


Four sequences of 40 response alternatives were prepared
for each of the five probability values. Each sequence was
determined randomly within the restriction that the appropriate
probability be maintained exactly over blocks of 20 trials.
A random permutation of the four sequences was drawn for each S
for successive four-day periods. Probability-reversal training
consisted of 40 trials per day for 30 consecutive days. A noncorrection
procedure was used in which both bars were exposed on
every trial. The bars were retracted for 5 seconds immediately
upon depression of either bar whether a reinforcement was received
or not.
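

The sequence construction described above can be sketched as follows. This is only an illustration under stated assumptions, not the original programming-tape procedure: the function name make_sequence, the labels "A1" and "A2" for the more and less frequently reinforced bars, and the use of Python's random module are all introduced here.

```python
import random

def make_sequence(p, n_trials=40, block=20, rng=None):
    """Return a list of 'A1'/'A2' labels giving the reinforced bar on each trial.

    Within every block of `block` trials exactly p * block trials reinforce
    A1 (the more frequently reinforced bar); the order inside a block is random.
    """
    rng = rng or random.Random()
    n_a1 = round(p * block)  # e.g. 18 of 20 trials for p = .90
    if abs(n_a1 - p * block) > 1e-9:
        raise ValueError("p cannot be held exactly over the block")
    sequence = []
    for _ in range(n_trials // block):
        chunk = ["A1"] * n_a1 + ["A2"] * (block - n_a1)
        rng.shuffle(chunk)
        sequence.extend(chunk)
    return sequence

# Four sequences for each of the five probability values (seeds are arbitrary).
sequences = {p: [make_sequence(p, rng=random.Random(i)) for i in range(4)]
             for p in (0.60, 0.70, 0.80, 0.90, 1.00)}
```

Shuffling within each block of 20, rather than over all 40 trials, is what keeps the scheduled probability exact in both halves of a session.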


Subjects 


The experimental subjects were 30 male albino rats of the
Sprague-Dawley strain. Their weight at the beginning of the training
ranged between 180 and 200 grams (60 to 80 days of age).




Apparatus 


The basic equipment consisted of a modified Skinner Box
equipped with a means of delivering liquid reinforcement (Lehigh
Valley dipper). Two retracting bars (Lehigh Valley) were located
three inches on either side of the dipper and three inches from
the floor of the box. A force of approximately 10 grams was required
to depress a lever fully. The dipper dispensed .02 ml. of sucrose
solution. The test apparatus was made of wood and had overall
dimensions of 12 X 12 X 12 inches. The inside of the apparatus
was painted a flat grey and was illuminated by two five-watt
bulbs centrally located at both ends and two inches from the top
of the box. The modified Skinner Box was placed in an insulated
test chamber. The Ss could be observed through one-way glass
located in the lid of the insulated test chamber.


All E-controlled events, i.e., the sequence of reinforced
response alternatives and intertrial interval, were operated by
an automatic programming device. All responses and intertrial
latencies were automatically recorded by a 4-pen ink recorder and
Hunter Klock-counter, respectively.


Procedure


Animals were housed for 10 days in individual cages on an
ad libitum diet. The animals were maintained on 22-hour food
deprivation after the tenth day. Each animal was then gentled
for 10 minutes a day for the next seven days. On the eighteenth
day the animals were placed in the experimental box, with both
bars retracted, for 15 minutes for acclimatization purposes, and
were then returned to their home cages and fed 20 minutes later.
On this day the water bottle was removed from their cage and was
replaced with another bottle containing 20 ozs. of 36% sucrose
solution. The sucrose solution was removed after it had been
consumed and the water bottles were replaced. This pre-experience
with the sucrose solution was found to be necessary to facilitate
pretraining as the animals did not seem to immediately prefer the
sucrose solution upon first tasting it.


Magazine Training 


The nineteenth and twentieth days consisted of magazine
training in which the dipper click (noise made by the dipper
when operating) was introduced and consequently followed by
reinforcement. Twenty reinforcements were given on the first
day. After twenty reinforcements the "click" then operated as
a discriminative stimulus which allowed the experimenter to shape
alternation behavior, from one side of the box to the other, for
twenty reinforced alternations. The bars were in the retracted
position during all magazine training.


Bar Training 


Day 1: Bar training consisted of "one-bar" training in which
the bars were randomly presented one at a time for twenty reinforced
responses.




Day 2: The Ss were given 10 consecutive reinforcements
with only one bar protruding. After the tenth bar press,
reinforcement on this bar was discontinued and the other bar
was released. The former bar was retracted if the S shifted, or
at the end of four more non-reinforced bar presses. The Ss
then received 10 consecutive reinforcements on the protruding bar
and were reversed twice more in the same way, thus receiving
forty reinforced trials in one session.


Day 3: The same as Day 2 except that the first 10 trials
started with the opposite bar to that used in Day 2.


Day 4: Consisted of 20 reinforced trials to each bar with
one bar protruding on the first ten trials and with both bars
protruding on the last ten trials. Reinforcement was reversed to
the other bar on the 21st trial.


Day 5: The same as Day 4 except that the initial trials
started on the opposite bar.


Day 6: Consisted of both bars protruding with only "one
bar" being reinforced for 40 trials. Both bars retracted after
a bar press and were released five seconds later.


During Days 1 to 5 the apparatus was manually operated.
From Day 6 on the apparatus was automatically controlled by a
programmed tape unit.




Reversal Training 


The Ss were run 40 trials per day receiving reinforcement
on one bar only. The following day the reinforcement was reversed
to the other bar for 40 trials. Reversal training continued
until the S reached a criterion of no more than five errors per
day with the last 9 out of 10 responses correct for two consecutive
days. After the criterion was attained, the Ss were assigned
at random to one of the five experimental groups.
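

The reversal criterion can be expressed as a small check. The sketch below is one reading of the rule, not the original scoring procedure: it assumes "the last 9 out of 10 responses correct" means at least 9 correct among the final 10 trials of the day, and the function names and the boolean trial coding are introduced here for illustration.

```python
def day_meets_criterion(correct, max_errors=5, window=10, min_correct=9):
    """correct: list of booleans, one per trial of a single 40-trial day."""
    errors = correct.count(False)
    last_ok = sum(correct[-window:]) >= min_correct
    return errors <= max_errors and last_ok

def days_to_criterion(days, needed=2):
    """Return the 1-based day on which the criterion is completed, or None."""
    streak = 0
    for i, day in enumerate(days, start=1):
        # The streak resets whenever a day fails either part of the rule.
        streak = streak + 1 if day_meets_criterion(day) else 0
        if streak == needed:
            return i
    return None
```

For example, a day with three early errors followed by 37 correct responses passes, whereas a day with only two errors that both fall among the last ten trials does not.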


Probability-Reversal Training 


The S was placed in the test apparatus in which both bars
were in the retracted position and after the compartment door
was closed the bars were released. A noncorrection procedure was
used in the present experiment, i.e., both bars would retract
after the S pressed either one of the bars. The bars remained
retracted for 5 seconds. A trial extended from the S's depression
of one of the bars (which resulted in reinforcement or
non-reinforcement) until the release of both bars, which
initiated the next trial. The session
was terminated after 40 trials and the S was returned to its home
cage for feeding 20 minutes later. Each S was run through each
training stage in a consecutive sequence from Day 1 of gentling
to the final day of probability-reversal training.




Results 


Reversal Performance 


The unit of measurement was the number of A1 responses for
each successive reversal. The response asymptotes at the end of
reversal pretraining for the five probability groups were .90,
.91, .90, .9, and [illegible].


The mean probability of A1 responses over all trials for
reversals 1 through 30 in blocks of two reversals is plotted
in Fig. 1. To test the significance of the difference between
the obtained curves and their predicted matching asymptotes for
each treatment group, means for each individual S were calculated
over the last 8 reversals and compared with the predicted mean
asymptotic value (matching) for each respective group. Significant
t's were found in the 1.00 and .60 groups (p < .05), while the
.90, .80, and .70 groups were not significantly different from
their predicted mean asymptotic values (see Table 1). In general
there seemed to be a tendency for the groups to under-match or
approximately match their respective probability schedules of
reinforcement.
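

The comparison of individual means with a predicted matching value can be sketched as a one-sample t statistic. The thesis does not spell out the exact computation, so the formula below is the standard one and is an assumption here; the data are hypothetical, and the predicted value is taken as P times the 40 daily trials.

```python
import math
import statistics

def one_sample_t(values, predicted):
    """Return (t, degrees of freedom) for the mean of `values` vs. `predicted`."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 in the denominator)
    t = (mean - predicted) / (sd / math.sqrt(n))
    return t, n - 1

# Hypothetical per-S means over the last 8 reversals for an .80 group;
# matching would predict .80 * 40 = 32 A1 responses per session.
hypothetical_means = [30.5, 33.0, 31.25, 34.0, 29.75, 32.5]
t, df = one_sample_t(hypothetical_means, predicted=0.80 * 40)
```

The resulting t would be referred to a t table with n - 1 degrees of freedom (5 for a group of six Ss).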


Table 2 presents a summary of the trend analysis of variance
of the means shown in Fig. 1. It can be seen from Table 2 that
there was a significant between-group probability effect
(p < .005). The general effect was for the probability of A1 to
increase with an increase in E1 events (Fig. 1). The trend analysis


Fig. 1. Mean percentage of A1 responses over all trials for reversals 1 to 30.
(Abscissa: Block of Two Reversals. Ordinate: Probability of A1 Responses.)


Table 1.


Means of individual Ss over last 8 reversals
and t of difference from predicted mean.


Subjects


Probability Groups


Table 2.


Summary of the analysis of the trends of A1 responses


Sources of Variance                         Mean Square

Probabilities                               [illegible]
Error (1)                                      35.12
Reversals                                   [illegible]
Linear                                         86.82
Quadratic                                   [illegible]
Probability x Reversals
Prob. x Rev. (linear)                       [illegible]
Prob. x Rev. (quadratic)                        3.83
Prob. (linear) x Rev. (linear)                 56.54
Prob. (quadratic) x Rev. (linear)             288.48
Error (2)                                       9.31


Significance Level
p < .005
p < .005




for successive reversals was significant (p < .05). The trend
analysis resulted in a significant linear effect (p < .005)
and a nonsignificant quadratic effect for successive reversals.
This means that the trend of the overall reversal means was
essentially linear and there was no evidence of significant
curvature. The general Probability X Reversals interaction was
nonsignificant, but when broken down into linear and quadratic
components, a significant linear effect was found (p < .005).
This means that the curves differed in linear slope but not in
the amount of curvature. The Linear X Linear and Quadratic X
Linear components of the Probability X Reversals interaction
were both significant at the p < .005 level, indicating that the
curves showed a significant change in slope as the probabilities
increased from .60 to .80, with the .90 and 1.00 groups levelling
off somewhat, resulting in a significant curvature effect.


The significance of differences over the last 8 reversals was
tested with Duncan's New Multiple Range Test (Edwards, 1960,
pp. 136-140) to detect differences between the final asymptotic
response levels of the 5 probability groups (Table 3). The
differences between means are summarized by the underscoring
at the bottom of Table 3. Any means underscored by the same
line do not differ significantly at the p < .05 level. It can be seen
from Table 3 that no differences occurred between the 1.00 and .90
groups, or between the .90 and .80 groups. Groups 1.00, .90, and .80 all
differed significantly from groups .70 and .60.


Table 3.


Duncan's New Multiple Range Test on the last 8 reversals.


                    Probability Groups
          .60      .70      .80      .90     1.00

Means    22.71    26.36    31.96    35.42    36.83


Any two treatment means not underscored by the same
line are significantly different.


Any two treatment means underscored by the same line
are not significantly different.
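

As a rough check on how close the group means in Table 3 lie to matching, they can be converted to proportions of a session. The sketch below assumes each mean is a number of A1 responses out of the 40 daily trials and reads the .70 entry as 26.36; the names used are illustrative only.

```python
TRIALS_PER_DAY = 40

# Group means from Table 3, keyed by scheduled probability P.
table3_means = {0.60: 22.71, 0.70: 26.36, 0.80: 31.96, 0.90: 35.42, 1.00: 36.83}

def matching_gap(p, mean_a1, trials=TRIALS_PER_DAY):
    """Return (observed proportion, observed minus scheduled probability)."""
    observed = mean_a1 / trials
    return observed, observed - p

for p, mean_a1 in table3_means.items():
    observed, gap = matching_gap(p, mean_a1)
    print(f"P = {p:.2f}: observed {observed:.3f}, gap {gap:+.3f}")
```

A negative gap corresponds to under-matching and a positive gap to over-matching, in the sense used in the text.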


Table 4.


Summary of the Frequency Plot of individual Ss' A1 responses over
the last 10 reversals


P = 1.00   1) Matching --------- 15%
           2) Within 4% -------- 25%
           3) Within 6% -------- 52%
           4) Within 8% -------- 63%

P = .90    1) Matching --------- 10%
           2) Within 4% -------- 20%
           3) Within 6% -------- [illegible]
           4) Within 8% -------- 62%

P = .80    1) Matching --------- 15%
           2) Within 4% -------- 38%
           3) Within 6% -------- 47%
           4) Within 8% -------- 57%

P = .70    1) Matching --------- 10%
           2) Within 4% -------- 22%
           3) Within 6% -------- 33%
           4) Within 8% -------- 37%

P = .60    1) Matching --------- 5%
           2) Within 4% -------- [illegible]
           3) Within 6% -------- 17%
           4) Within 8% -------- 20%




To demonstrate the effect of the different probabilities
on the shape of the learning curve, the mean probabilities over
trials (within days) were plotted for the five groups and are
shown in Figs. 2, 3, and 4 for blocks of reversals 1 to 5,
13 to 17, and 26 to 30, respectively. Trials 1, 2, 3, 4, and 5
are plotted individually and the remaining 35 trials are grouped
in blocks of five trials. In general the final asymptotic
response levels of the five curves tended to resemble matching
as predicted by Estes' statistical learning theory. The 1.00,
.70, and .60 groups consistently under-matched their predicted
asymptotes over the reversal series, while the .90 group tended
to over-match its predicted asymptotic level.


A liberal interpretation of the curves would support Estes'
prediction, but the curves do not support his predictions of response
levels at the beginning of each reversal. In the present study,
statistical learning theory would predict initial response
probability (Po) at the beginning of each reversal to equal 1-Pa,
where Pa equals the asymptotic response level of the previous
session. If the asymptotic level equalled .80, then the S when
reversed should manifest a Po = .20. To demonstrate the relationship
of Po to the previous day's final asymptote, the predicted
Po's and observed Po's have been plotted for the 5 groups and
are presented in graph form in Figs. 5, 6, and 7. In reversals
1 to 5 there is an obvious discrepancy between predicted and
observed Po's for all five groups. In reversals 13 to 17 a large
difference exists in the 1.00, .90, and .80 groups with the .60


Figs. 2, 3, and 4. Mean probabilities within days averaged over
reversals. (Ordinate: Probability of A1. Abscissa: trials 1, 2, 3, 4, 5,
6-10, 11-15, 16-20, 21-25, 26-30, 31-35, 36-40.)




Figs. 5, 6, and 7. Mean predicted and observed Po values for the
five probability groups. (Ordinate: Po. Abscissa: Probability Groups
1.00, .90, .80, .70, .60.)




group showing a smaller difference and the .70 group showing
very little difference. The differences in the last block of
reversals tend to approximate the differences reflected in the first
block of reversals, where the smallest difference was about 20 per cent.
Although no appropriate statistical analysis was performed, the data
strongly suggested that the initial response probabilities (Po),
as depicted by the curves, were not congruent with Estes' prediction
of Po. As can be seen in Fig. 6, discrepancies were as large
as 68% for the 1.00 and .90 groups in reversals 13 to 17.
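

The prediction being tested here is simple enough to state in a few lines. The sketch below encodes only the relation Po = 1 - Pa given in the text; the function names are introduced for illustration, and any observed value passed in would be hypothetical rather than read from the figures.

```python
def predicted_p0(previous_asymptote):
    """Predicted initial probability of the newly correct response: 1 - Pa."""
    return 1.0 - previous_asymptote

def p0_discrepancy(previous_asymptote, observed_p0):
    """Observed minus predicted Po for one reversal session."""
    return observed_p0 - predicted_p0(previous_asymptote)

# Example from the text: an asymptote of .80 predicts Po = .20.
print(round(predicted_p0(0.80), 2))
```

A positive discrepancy means the S began the reversed session responding on the newly correct bar more often than the theory predicts.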


Comparison of the mean A1 curves in Fig. 1 with the
within-reversal curves of Figs. 2, 3, and 4 prompted an inspection of each
S's A1 response level. To demonstrate what proportion of the Ss
were actually under-matching, matching or over-matching, total A1
responses for each S within the different groups were plotted in
a frequency distribution for each of the last 10 reversal sessions.
The values presented in Table 4 are the percentage of Ss that
manifested matching or that were within 4, 6, and 8 per cent on
either side of the matching asymptote. Examination of Table 4
clearly reveals that the highest percentage of Ss manifesting
matching behavior was only 15 per cent. Furthermore, increasing
the interval to 8 per cent on either side of the matching value
accounted for only 63, 62, 57, 37, and 22 per cent of the Ss for the 1.00,
.90, .80, .70, and .60 groups respectively. In terms of mean response
levels the individual Ss were not matching their respective probability
reinforcement schedules, as predicted by statistical learning theory.


These results clearly suggest that the tendency for the curves of
Fig. 1 to match their respective schedules of reinforcement was
an obvious artifact due to the averaging of individual Ss' A1
responses to attain group means.
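

The averaging artifact can be illustrated with a small constructed example. The session totals below are hypothetical, and the sketch assumes that "within 4 per cent" means within .04 of the scheduled probability when responses are expressed as a proportion of 40 trials; the thesis does not state the exact convention.

```python
TRIALS = 40

def percent_within(counts, p, tolerance, trials=TRIALS):
    """Percentage of session totals whose A1 proportion lies within
    `tolerance` (as a proportion) of the scheduled probability p."""
    hits = sum(1 for c in counts if abs(c / trials - p) <= tolerance + 1e-9)
    return 100.0 * hits / len(counts)

# Hypothetical A1 totals for six Ss on an .80 schedule.
counts = [40, 24, 38, 26, 32, 32]

# The group mean proportion equals .80, so the group appears to "match".
group_mean = sum(counts) / len(counts) / TRIALS

for tol in (0.0, 0.04, 0.06, 0.08):
    print(tol, round(percent_within(counts, 0.80, tol), 1))
```

In this constructed case the group mean sits exactly on the matching value while only a third of the individual totals fall within even the widest band, which is the pattern the text describes.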


Error Analysis 


The error analysis was performed in an attempt to determine
a possible mechanism or mechanisms responsible for the Ss' behavior.
It was hoped that the analysis would shed some light on the
discrepancies between predicted and observed Po values. The data
were analyzed to determine if the groups differed with respect to
the number of consecutive unrewarded A1 responses made at the beginning
of each session. According to statistical learning theory the five
groups should differ in their Po values. These should differentially
affect the number of consecutive unrewarded A1 responses at the
beginning of a session. The mean number of consecutive unrewarded
responses made to the A1 bar is presented in Table 5. No
significant t's were found between the five groups. The results
suggested that a common mechanism may have been operating independently
of the probability schedules of reinforcement.
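

The measure used in this analysis can be sketched as a count over one session's trial record. The scoring rule below is an assumption, since the thesis does not give the exact procedure: it counts unrewarded responses on the bar chosen first and stops at the first shift or the first reward. The bar labels "L" and "R" are illustrative.

```python
def initial_unrewarded_run(choices, rewarded):
    """Count consecutive unrewarded responses on the first-chosen bar
    at the start of a session, stopping at the first shift or reward."""
    if not choices:
        return 0
    first_bar = choices[0]
    run = 0
    for bar, got_reward in zip(choices, rewarded):
        if bar != first_bar or got_reward:
            break
        run += 1
    return run

# Hypothetical opening of a session: three unrewarded presses, then a shift.
choices = ["L", "L", "L", "R", "R"]
rewarded = [False, False, False, True, True]
run_length = initial_unrewarded_run(choices, rewarded)
```

Averaging this count over sessions and Ss within a group would give one mean per probability group, in the form reported in Table 5.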


Inspection of the response records suggested that a "win-stay,
lose-shift" strategy or mechanism may have been operating during
the early trials of the reversal sessions. In the following analysis
of the data the term "strategy" is used only to represent the
possible response patterns to the reinforcing events. A win-stay,
lose-shift strategy operates on a recency basis, i.e., the S's




Table 5

Mean consecutive unrewarded responses before
shifting to A1

[Columns: probability groups. Row: mean unrewarded responses.
The cell values are missing from this copy.]


Table 6

Mean per cent number of times S used a W-S, L-S
strategy in blocks of ten trials

[Rows: probability groups 1.00, .90, .80, .70, and .60, each at
reversals 1 - 5, 13 - 17, and 26 - 30. The cell values are missing
from this copy.]



behavior is a function of whether reinforcement or nonreinforcement
occurred on only the previous trial. The following analysis
attempted to isolate this effect, although clearly it does not
imply that only this strategy was operating. In the present
experiment there were 8 possible response patterns to the E1 and
E2 events, four of which would be indicative of this strategy.
The four pertinent strategies (marked with asterisks) are illustrated
in Fig. 8. The percentage of trials on which the Ss manifested a
win-stay, lose-shift strategy is presented in Table 6 in blocks
of ten trials for reversal blocks 1 to 5, 13 to 17, and 26 to 30.
From this table it is obvious that the 1.00, .90, and .80 groups
were predominantly using this strategy throughout the entire
probability-reversal training. Although the .70 and .60 groups
were lower than the above three groups, they still consistently
manifested this strategy over 50% of the time. In general, then,
the data seemed to support a win-stay, lose-shift strategy. At
the same time it is also obvious that as the probability of E2
events (1-P) increased, there was a decrease in the number of times
the Ss manifested this response pattern.
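
The tally behind a percentage of this kind can be sketched as follows. This is only an illustration under stated assumptions, not the scoring procedure used in the study: the function name `wsls_percent`, the coding of responses and events as 1 or 2, the definition of a "win" as a trial on which the event matched the response, and the sample block of ten trials are all hypothetical.

```python
def wsls_percent(responses, events):
    """Percent of trial-to-trial transitions consistent with win-stay, lose-shift.

    responses[i] is the bar pressed on trial i (1 or 2) and events[i] is the
    reinforcing event on trial i (1 or 2). Assumed coding: a "win" is a trial
    on which the event matched the response.
    """
    if len(responses) != len(events):
        raise ValueError("responses and events must have equal length")
    transitions = len(responses) - 1
    if transitions < 1:
        raise ValueError("need at least two trials")
    consistent = 0
    for i in range(transitions):
        won = responses[i] == events[i]
        stayed = responses[i + 1] == responses[i]
        # Win followed by stay, or loss followed by shift, fits the strategy.
        if won == stayed:
            consistent += 1
    return 100.0 * consistent / transitions


# Hypothetical block of ten trials (not data from the study).
responses = [2, 2, 1, 1, 1, 2, 1, 1, 1, 1]
events = [1, 1, 1, 1, 2, 1, 1, 1, 2, 1]
print(round(wsls_percent(responses, events), 1))  # 7 of 9 transitions fit: 77.8
```

Scoring transitions rather than single trials reflects the recency basis of the strategy: each response is judged only against the outcome of the trial immediately before it.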


To demonstrate this effect the E1 and E2 events of the win-stay,
lose-shift strategy were analyzed separately, and are presented in
Tables 7 and 8 and Figs. 9 and 10. For E1 events (Table 7 and Fig. 9),
it can be seen quite clearly that there is an obvious relationship
between the probability reinforcement schedule and the percentage
of times the Ss manifested a win-stay, lose-shift strategy. The analysis




E1    A1 ------ Stay *
      A1 ------ Shift
      A2 ------ Stay
      A2 ------ Shift *

E2    A1 ------ Stay
      A1 ------ Shift *
      A2 ------ Stay *
      A2 ------ Shift


* Win-stay, Lose-shift


Fig. 8. The 8 possible response alternatives
to E1 and E2 events.


Table 7.


Mean percent of times the Ss manifested a W-S, L-S
strategy to E1 events in blocks of ten trials over
reversals 1 to 5, 13 to 17, and 26 to 30.


Groups    Reversals 1 to 5    Reversals 13 to 17    Reversals 26 to 30

[Rows: probability groups 1.00, .90, .80, .70, and .60. The cell
values are missing from this copy.]


[Plot: mean percent against probability groups (.60 to 1.00), with
three curves: Perfect W-S, L-S; P of E1; Observed.]

Fig. 9. Mean percent of times Ss manifested a W-S, L-S
strategy to E1 events averaged over all reversals.


Table 8.


Mean percent of times the Ss manifested a W-S, L-S
strategy to E2 events in blocks of ten trials over
reversals 1 to 5, 13 to 17, and 26 to 30.


Groups    Reversals 1 to 5    Reversals 13 to 17    Reversals 26 to 30

[Rows: probability groups. The cell values are missing from this copy.]


[Plot: mean percent against probability groups (.60 to 1.00), with
three curves: Perfect W-S, L-S; P of E2; Observed.]

Fig. 10. Mean percent of times Ss manifested a W-S, L-S strategy
to E2 events averaged over all reversals.




of the E2 events did not support this interpretation. In general,
the results of the win-stay, lose-shift analysis indicated that the
Ss "attended" to the E1 events and tended to ignore the E2 events,
while, at the same time, their behavior seemed to be closely related
to their respective probability schedules of reinforcement.


The mean number of times the Ss shifted from A1 to A2 during
an experimental session was calculated for the 30 reversals.
The results are presented in Table 9. The median test (with Yates'
correction) was conducted to test the differences among the five
groups. The .70 group made significantly more shifts than the
1.00 and .90 groups, while the .60 group was significantly
different only from the 1.00 group (p < .05). The .80 and .90 groups
showed no significant differences in shifts from the 1.00 and .60
groups. In view of the fact that the 1.00 group tended to shift
4.6 times in a session even though there were no programmed shifts,
it would seem reasonable to treat 4.6 as a base or operant rate of
shifting and to subtract it from the rest of the observed values.
The corrected values are also presented in Table 9.


The mean cumulative response latencies within a session
averaged over the 30 reversals for the five groups are presented
in Fig. 11 and Table 10. The data were subjected to a trend analysis
of variance which is presented in Table 11. No significant differences
were found between reversals. The trend analysis resulted in
significant linear, quadratic, and cubic components (p < .005) and a




nonsignificant quartic component. Inspection of Fig. 11 reveals
that latency was not a simple increasing monotonic function with
a decrease in the probability of E1 events.




Table 9


Mean number of shifts averaged over all reversals.


Groups    Programmed    Observed    Corrected

[Rows: probability groups. The cell values are missing from this copy.]


Table 10


Mean cumulative latency within sessions
averaged over all reversals.


[Columns: probability groups 1.00, .90, .80, .70, and .60. Row:
seconds. The cell values are missing from this copy.]




Table 11


Trend analysis of latency scores.


Source of Variance      df    Mean Square      F

A. Reversals             5        41.56      1.59

B. Latencies

     Linear              1      1520.06      5.83 *
     Quadratic           1      2368.05      9.09 *
     Cubic               1      2088.60      8.01 *
     Quartic             1       237.74       .91
     Error              20       260.60

* p < .005
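
The F ratios for the trend components in Table 11 can be checked from the printed mean squares, since each F is the component mean square divided by the error mean square of 260.60. The short sketch below does only that arithmetic; the variable and function names are illustrative, and the Reversals row is left out because its error term is not given in the table.

```python
# Mean squares as printed in Table 11.
ERROR_MS = 260.60
TREND_MS = {
    "linear": 1520.06,
    "quadratic": 2368.05,
    "cubic": 2088.60,
    "quartic": 237.74,
}


def f_ratio(ms_effect, ms_error):
    """F ratio for a single-df trend component: effect MS over error MS."""
    return ms_effect / ms_error


f_values = {name: round(f_ratio(ms, ERROR_MS), 2) for name, ms in TREND_MS.items()}
print(f_values)
# {'linear': 5.83, 'quadratic': 9.09, 'cubic': 8.01, 'quartic': 0.91}
```

The recomputed values agree with the F column of the table, including a quartic F of .91.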

[Plot: mean cumulative latency in seconds (axis ticks at 600, 700,
and 800) against probability groups 1.00, .90, .80, .70, and .60.]

Fig. 11. Mean cumulative latencies within
sessions averaged over all reversals.




Discussion 
The major findings of this study can be summarized as follows: 


(a) Different probability schedules of reinforcement had
differential effects on the final asymptotic response levels in
successive probability reversals.

(b) The final mean asymptotes of the groups appeared to
match their respective probability schedules of reinforcement.

(c) Individual Ss did not match their respective probability
schedules of reinforcement.

(d) The Po values for each probability group, as predicted
by statistical learning theory, did not coincide with the observed
Po values.

(e) The error analysis revealed:

    (1) That there were no differences between the
        five groups in the number of consecutive
        unrewarded A2 responses made at the
        beginning of each session before shifting
        to the A1 bar;

    (2) That the groups differed in the percentage
        of times they used a win-stay, lose-shift
        strategy during the reversal sessions; and

    (3) That the Ss "attended" more to the E1 events
        than the E2 events.




The findings of this study in regard to the final mean
asymptotic response level do not agree with those of Uhl (1963),
who trained rats in the same apparatus but in a 2-choice probability
learning situation for 1,000 trials, Edwards (1961), who trained
human Ss for 1,000 trials, or Wilson (1960), who trained monkey Ss
for 1,024 trials. In general, the results do not support the
"maximization" hypothesis of decision-making theory as supported
by Davidson, Suppes, and Siegel (1957); Edwards (1954); Siegel
and Goldstein (1959); and Stanley (1950). The final asymptotic
levels of the curves in Fig. 1 tended to partially support Estes'
prediction that the final asymptotic level of A1 responses will
tend to approach the probability of reinforcement for that event
occurring. Examination of individual Ss' A1 response levels clearly
revealed that the Ss were not matching their respective probability
schedules. The results do not support the findings of Estes (1954)
which resulted in individual Ss conforming to the matching prediction
of statistical learning theory. As emphasized by Anderson and Grant
(1957, 1958) and Anderson (1959, 1960, and 1962), data when analyzed
over trials, i.e., averaged response probabilities, can lead to
erroneous conclusions when testing the validity of statistical
learning theory.
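
The way in which averaging can produce apparent matching may be illustrated with a small numerical sketch. The individual proportions below are invented for the purpose and are not data from this study; the names `group_mean`, `individuals`, and `tolerance` are likewise hypothetical.

```python
def group_mean(proportions):
    """Arithmetic mean of individual A1 response proportions."""
    return sum(proportions) / len(proportions)


# Hypothetical terminal A1 proportions for eight Ss on a .70 schedule.
# None of these values comes from the thesis.
schedule = 0.70
individuals = [0.95, 0.90, 0.90, 0.85, 0.50, 0.50, 0.45, 0.55]

mean = group_mean(individuals)

# An S is counted as "matching" if it falls within an arbitrary
# tolerance of the scheduled probability.
tolerance = 0.05
matching = [p for p in individuals if abs(p - schedule) < tolerance]

print(round(mean, 2))  # 0.7, the group mean matches the schedule
print(len(matching))   # 0, no individual S matches it
```

In this constructed case the group mean equals the scheduled probability although every S responds well above or well below it, which is the sense in which a matching group curve can be an artifact of averaging.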


In general, the first impression of the curves seemed to indicate 
that something like matching was occurring, but comparison of the 
observed Po values to the predicted Po values revealed that there 


was an obvious discrepancy between the two sets of values. This 




discrepancy between the observed and predicted Po values and
the generally poor fit of the curves when plotted over trials,
combined with the evidence that few Ss actually manifested
matching behavior, strongly suggest that statistical learning
theory does not adequately explain the present findings.


Because the initial behavior (Po) at the beginning of each
reversal session did not concur with statistical learning theory
expectations, an attempt was made to identify a possible mechanism
or mechanisms to explain the discrepant results. From statistical
learning theory the five groups would be expected to differ in
the number of consecutive unrewarded A2 responses at the beginning
of a session. No significant differences were found between any of
the five groups. This suggested that a common mechanism was
operating independent of the probability schedules of reinforcement,
at least in the first five trials. This analysis tended to support
the high Po values and consequently contradicted statistical learning
theory in regard to its prediction of initial reversal behavior.


Overall and Brown (1957) pointed out that probability models
require a minimal amount of past experience upon which prediction
can be based; consequently, they deleted predictions made for the
first five responses of each day. Anderson* has also suggested
that one should consider discarding the first five or even 10
trials of each day for the same reason as mentioned by Overall and


* Personal communication 




Brown. Anderson also suggested that the Ss are sampling from a stimulus
population that consists of more than just the stimuli that were
available for conditioning in the latter part of the previous
reversal. Examples of these stimuli that were not part of the
stimulus population during the entire reversal session are as
follows: (a) internal stimuli (kinesthetic) arising from the
handling of the Ss when being removed from the cage and placed in
the apparatus; (b) slight changes in the smell of the test chamber
due to other Ss; (c) lingering odor of E's hands in the test
chamber; and (d) internal stimuli arising from exploratory behavior.
All of these sources could contribute to "extraneous" stimuli
which were not part of the stimulus population at the end of the
previous reversal session. It also seems reasonable to hypothesize
that these extraneous stimuli would probably predominate in the stimulus
population the Ss were sampling from during the early trials of
each daily session. Assuming that the position preferences of the
groups would have little or no effect on response choices, extraneous
stimuli would tend to result in Po values around .50. The above
may be considered a possible explanation of the observed high Po
values in the present study.


If the procedure of Anderson and of Overall and Brown in regard to
the first five or ten trials is applied to the present findings,
then the curves in Figs. 2, 3, and 4 would manifest little or no
negatively accelerated growth, and would have extremely divergent
Po values. The forms of the acquisition curves and the asymptotic




response probabilities predicted from statistical learning theory
were not in good agreement with the data. It was hypothesized
that the data may be explainable on a recency basis rather than
by a frequency or probability model. In the present study, for the
Ss to reach criterion in the reversal pretraining, their behavior
would have had to be determined by the most recent events and
not by the past events of the previous reversal (frequency).
The attainment of criterion could be explained by a win-stay,
lose-shift strategy which operates on a recency basis. If this
strategy did in fact exist it seems plausible that it could have
persisted through all of the probability-reversal training.
Although no adequate statistical analysis could be performed
on the data, the results presented in Table 4 strongly supported
the hypothesis that this strategy persisted throughout the
entire training. Goodnow and Pettigrew (1955) have suggested
that whatever pre-experimental response tendencies, sets, or strategies
Ss bring to the task may persist throughout the experiment proper.
Anderson (1960), in studying the effect of first-order conditional
probability in a two-choice learning situation, found that high and
low conditional probability sequences produced different acquisition
behavior. More importantly, this difference was maintained at a
high level over several hundred transfer trials on a 50:50 random
sequence common to all conditions. The results of Goodnow and
Pettigrew (1955), Friedman et al. (1960), and Anderson (1960) support
what Anderson has called "repetition responding", that is, predicting
next that event which occurred last. In view of this supporting




evidence it seems reasonable that the initial reversal training
trials produced a set or mode of responding and that this set
persisted throughout the entire probability-reversal training.


The analysis of the E1 and E2 events also demonstrated that
the percentage of time the Ss manifested these strategies was
monotonically related to the probability schedules of reinforcement.
Analyzing the E1 and E2 events separately also showed that the Ss
"attended" more to the E1 events than the E2 events. More
specifically, there seems to be a linear relationship between the
probability schedules and the effect of reinforced and nonreinforced
responses; i.e., as the probability of an E1 event increases, the
effect of a nonreinforced trial in changing behavior (shifting)
decreases and vice versa. The above results are congruent with
the findings of Atkinson (1956), Neimark (1956), and Millward
(1960) on the effect of nonreinforced trials on behavior. Similarly,
it has been hypothesized that a reinforced trial has an enhancement
effect if it occurs in a series of nonreinforced trials (Estes and
Burke, 1953).


The results so far discussed indicate that the behavior of the 
Ss was to a considerable degree being controlled by a recency or 
postremity principle, but at the same time was quite dependent on 
the probability of reinforcement. This is not in line with a statistical 
learning theory interpretation of the data. The analysis of the 


data does, however, strongly suggest that the development of a 




win-stay, lose-shift strategy occurred during pretraining and
persisted throughout the 30 reversals. The persistence of this
strategy would in part account for the incongruities of the curves,
such as the high Po values and the non-negatively accelerated growth
characteristics.


The efficacy of this strategy for providing information to
the Ss in terms of maximizing reinforcements would decrease with
an increase in the probability of E2 events. In the case of the
1.00 group the utility of this strategy would be optimum; i.e.,
a reinforced trial always follows a reinforced trial and never
follows a nonreinforced trial, whereas in the .60 group a reinforced
trial does not always follow a reinforced trial and can follow a
nonreinforced trial. If the utility of this strategy is considered
in regard to the amount of information that is significant in
acquiring the maximum number of reinforcements, i.e., which bar is
the pay-off bar today, it should be apparent that the amount of
information decreases with an increase in E2 events and that the
probability schedules of reinforcement should interact with this
strategy in the determination of the A1 responses and the final
asymptotic response level.
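
The decline in the utility of the strategy can be made concrete with a simplified calculation. The sketch rests on an assumption that is not confirmed by the text, namely that on every trial exactly one of the two events occurs, E1 with a fixed probability and independently of the S's responses and of earlier trials. Under that assumption a perfect win-stay, lose-shift S always presses the bar that paid off on the previous trial. The function name `wsls_expectations` is hypothetical.

```python
def wsls_expectations(p_e1):
    """Return (P(A1), P(reinforced)) for a perfect win-stay, lose-shift S.

    Assumes E1 occurs with probability p_e1 on every trial, independently
    of the S's responses and of earlier trials. This independence is an
    assumption of the sketch, not a detail taken from the study.
    """
    p_e2 = 1.0 - p_e1
    # The S repeats the bar that paid off last trial, so it presses A1
    # exactly as often as E1 occurred on the preceding trial.
    p_a1 = p_e1
    # The response is reinforced when the current event equals the previous one.
    p_reinforced = p_e1 * p_e1 + p_e2 * p_e2
    return p_a1, p_reinforced


table = []
for p in (1.00, 0.90, 0.80, 0.70, 0.60):
    p_a1, p_rf = wsls_expectations(p)
    table.append((p, round(p_a1, 2), round(p_rf, 2)))
    print(f"P(E1) = {p:.2f}   P(A1) = {p_a1:.2f}   P(reinforced) = {p_rf:.2f}")
```

Under these assumptions the expected proportion of reinforced trials falls from 1.00 in the 1.00 group toward chance (.52) in the .60 group, while the expected proportion of A1 responses equals the scheduled probability. The first result parallels the loss of information described above; the second suggests how a recency strategy and the schedule could jointly produce group curves that resemble matching.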


The present study supports Overall and Brown's (1957) view
that "until frequency and probability theories include a recency
principle, it appears that they will be neglecting an important
consideration, as shown by the growing body of data on the relative




importance of recent events in the learning sequence." An overview
of the findings of this study points out that analysis of sequential
dependencies or specific response patterns, such as a win-stay,
lose-shift strategy, is probably more effective in the description and
analysis of behavior than using the mean asymptotic values of the
learning curves.




Summary and Conclusions 


To determine the effects of different probability schedules
of reinforcement over a series of successive reversals in a
two-choice learning situation, five groups of rats were run in a
modified two-bar Skinner Box under 1.00, .90, .80, .70, and .60
equal probability schedules of reinforcement.


The different schedules of reinforcement had differential
effects on the terminal response levels. In general, the final
mean asymptotes of the different groups tended to match their
respective probability schedules of reinforcement. The inadequacy
of analyzing grouped data in terms of mean A1 response levels
was demonstrated by the fact that individual Ss did not match
their respective probability schedules of reinforcement.


A statistical learning theory interpretation of the data 
in regard to the Po values was also found to be inadequate. The 
extremely high Po values were somewhat clarified by an error analysis 
of the data. This analysis supported a win-stay, lose-shift strategy 


interpretation of the data. 


In general, the findings of the error analysis suggest that
the results of the present study may best be interpreted as an
interaction of a pre-experimentally induced strategy or pattern of
responding with different probability schedules of reinforcement.
It was found that this interpretation was congruent with the general
shape of the curves, whereas the statistical learning theory approach
was not.




It was concluded that an error or sequential response analysis
of the data would probably be a more fruitful approach for this type
of study, as averaged or mean response data may or may not reflect
the underlying mechanisms, sets, or strategies controlling behavior.
The present study supports the contentions of some researchers
(Anderson, 1960; Anderson and Whalen, 1960; Engler, 1958; Overall
and Brown, 1957; and Witte, 1961). Their view is succinctly
summarized by Witte (1961), "that since behavior is apparently
a function of more remote events, as well as the immediately preceding
event, an analysis of sequence effects is mandatory."




References 


Anderson, N. H. An analysis of sequential dependencies. In R. R.
Bush and W. K. Estes (eds.), Studies in mathematical learning
theory. Stanford: Stanford University Press, 1959, pp. 248-264.


Anderson, N. H. Effect of first-order conditional probability in
a two-choice learning situation. J. exp. Psychol., 1960, 59,
73-93.


Anderson, N. H. Comments on Professor Estes' paper. In A. W. Melton
(ed.), Proceedings of the Michigan ONR conference on human
learning, 1962.


Anderson, N. H. & Grant, D. A. A test of statistical learning
theory model for two-choice behavior with double stimulus
events. J. exp. Psychol., 1957, 54, 305-317.


Anderson, N. H. & Grant, D. A. Correction and reanalysis. J. exp. 
Psychol., 1958, 56, 453-454. 


Anderson, N. H. & Hovland, C. I. The presentation of order effects
in communication research. In C. I. Hovland (ed.), The order
of presentation in persuasion. New Haven: Yale University
Press, 1957, pp. 158-169.


Anderson, N. H. & Whalen, R. E. Likelihood judgments and sequential
effects in a two-choice probability learning situation. J. exp.
Psychol., 1960, 60, 111-120.


Atkinson, R. C. An analysis of the effect of nonreinforced trials
in terms of statistical learning theory. J. exp. Psychol., 1956, 
52, 28-32. 


Bitterman, M. E., Wodinsky, J., & Candland, D. K. Some comparative
psychology. Amer. J. Psychol., 1958, 71, 94-110.


Brookshire, K. H., Warren, J. M., & Ball, G. E. Reversal and transfer
learning following overtraining in rat and chicken. J. comp.
physiol. Psychol., 1961, 54, 98-102.


Brunswick, E. Probability as a determiner of rat behavior. J. exp.
Psychol., 1939, 25, 175-197.


Bush, R. R. and Mosteller, F. A mathematical model for simple learning. 
Psychol. Rev., 1951, 58, 313-323. 


Bush, R. R. and Mosteller, F. Stochastic models for learning. New York: 
Wiley, 1955. 




Buytendijk, R. J. J. Uber das Umlernen. Arch. neurol. Physiol., 
1930, 15, 283-310. 


Capaldi, E. J. and Stevenson, H. W. Response reversal following
different amounts of training. J. comp. physiol. Psychol.,
1957, 50, 195-198.


Davidson, D., Suppes, P., and Siegel, S. Some experiments and
related theory in the measurement of utility and subjective
probability. In D. Davidson, P. Suppes, and S. Siegel (eds.),
Decision making: an experimental approach. Stanford:
Stanford University Press, 1957, pp. 19-81.

Detambel, M. H. A test of a model for multiple-choice behavior. 
J. exp. Psychol., 1955, 49, 2, 97-104. 


Dufort, R. H., Guttman, N., and Kimble, G. A. One trial discrimination
reversal in the white rat. J. comp. physiol. Psychol., 1954, 47,
248-249.


Edwards, A. L. Experimental design in psychological research. New York:
Holt, Rinehart and Winston, 1960.


Edwards, W. The theory of decision making. Psychol. Bull., 1954,
51, 380-417.


Edwards, W. Reward probability, amount, and information as determiners
of sequential two-alternative decisions. J. exp. Psychol., 1956,
52, 177-188.


Edwards, W. Probability learning in 1000 trials. J. exp. Psychol.,
1961, 62, 385-394.


Engler, J. Marginal and conditional stimulus and response probabilities
in verbal conditioning. J. exp. Psychol., 1958, 55, 303-317.


Estes, W. K. Toward a statistical theory of learning. Psychol. Rev.,
1950, 57, 94-107.


Estes, W. K. Individual behavior in uncertain situations: an
interpretation in terms of statistical association theory.
In R. A. Thrall, C. H. Coombs, & R. L. Davis (eds.), Decision
processes. New York: Wiley, 1954, pp. 141-149.


Estes, W. K. & Burke, C. J. A theory of stimulus variability in 
learning. Psychol. Rev., 1953, 60, 276-286. 


Estes, W. K. & Burke, C. J. Application of a statistical model to 
simple discrimination learning in human subjects. J. exp. Psychol., 
1955, 50, 81-88. 




Estes, W. K., Burke, C. J., Atkinson, R. C., & Frankmann, J. P.
Probabilistic discrimination learning. J. exp. Psychol.,
1957, 54, 233-239.


Estes, W. K. & Schoeffler, M. S. Analysis of variables influencing
alternation after forced trials. J. comp. physiol. Psychol.,
1955, 48, 357-362.


Estes, W. K. & Straughn, J. H. Analysis of a verbal conditioning 
situation in terms of statistical learning theory. J. exp. 
Psychol., 1954, 47, 4, 225-234. 


Friedman, M. P., Burke, C. J., Cole, M., Estes, W. K. & Millward, R. B.
Extended training in a noncontingent two-choice situation with
shifting reinforcement probabilities. (Presented at 1st meeting
Psychonomic Soc., Chicago, Ill., Sept. 1-3, 1960.)


Gardner, R. A. Probability-learning with two and three choices.
Amer. J. Psychol., 1957, 70, 174-185.


Goodnow, J. J. (1951). Replication of Stanley's Experiment 1950.
In R. R. Bush & F. Mosteller, Stochastic models for learning.
New York: Wiley, 1955.


Goodnow, J. J. & Pettigrew, T. F. Effect of prior patterns of
experience upon strategies and learning sets. J. exp. Psychol.,
1955, 49, 381-389.


Grant, D. A., Hake, H. W. & Hornseth, J. P. Acquisition and extinction
of verbal conditioning response with differing percentage of
reinforcement. J. exp. Psychol., 1951, 42, 1, 1-5.


Guttman, W. Operant conditioning, extinction, and periodic reinforcement
in relation to concentration of sucrose solution. J. exp. Psychol.,
1953, 46, 213-224.


Humphreys, L. E. Acquisition and extinction of verbal expectations in 
a situation analogous to conditioning. J. exp. Psychol., 1939, 
25, 294-300. 


Jarvik, M. E. Probability learning and a negative recency effect in 
the serial anticipation of alternate symbols. J. exp. Psychol., 
1951, 41, 291-297. 


Koronakos, C. & Arnold, W. J. The formation of learning sets in rats.
J. comp. physiol. Psychol., 1957, 50, 11-14.


Krechevsky, I. A study of the continuity of the problem solving
processes. Psychol. Rev., 1938, 45, 107-134.




Lauer, D. W. & Estes, W. K. Observed and predicted terminal 
distribution response probability under two conditions of 
random reinforcement. Amer. Psychologist, 1954, 9, 413. 


Mackintosh, N. J. The effects of overtraining on a reversal
and a nonreversal shift. J. comp. physiol. Psychol., 1962,
55, 555-559.


Millward, R. B. A comparison of two learning models for two-choice 
conditioning experiments involving nonreinforced trials. 
Unpublished doctoral dissertation, Indiana University, 1960. 


Morse, E. B. & Runquist, W. N. Probability-matching with an 
unscheduled random sequence. Amer. J. Psychol., 1960, 73, 
603-607. 


Neimark, E. D. Effects of type of non-reinforcement and number of
alternative responses in two verbal conditioning situations.
J. exp. Psychol., 1956, 52, 209-220.


Neimark, E. D., and Shuford, E. H. Comparisons of predictions and
estimates in a probability learning situation. J. exp. Psychol.,
1959, 57, 294-298.


North, A. J. Improvement in successive discrimination reversals.
J. comp. physiol. Psychol., 1950, 43, 442-460 (a).


North, A. J. Performance during an extended series of discrimination
reversals. J. comp. physiol. Psychol., 1950, 43, 461-470 (b).


North, A. J. & Clayton, K. N. Irrelevant stimuli and degree of
learning in discrimination learning and reversal. Psychol.
Repts., 1959, 5, 405-408.


Overall, J. E. & Brown, W. L. Recency, frequency, and probability in
response prediction. Psychol. Rev., 1957, 64, 314-323.


Parducci, A. & Polt, J. Correction vs. noncorrection with changing
reinforcement schedules. J. comp. physiol. Psychol., 1958, 51,
492-495.


Pubols, B. H. The facilitation of visual and spatial discrimination
reversal by overlearning. J. comp. physiol. Psychol., 1956, 49,
243-248.


Reid, L. S. The development of noncontinuity behavior through 
continuity learning. J. exp. Psychol., 1953, 46, 107-112. 




Siegel, S. Theoretical models of choice and strategy behavior:
Stable state behavior in the two-choice uncertain outcome
situation. Psychometrika, 1959, 24, 303-316.


Siegel, S. & Goldstein, D. A. Decision-making behavior in a two- 
choice uncertain outcome situation. J. exp. Psychol., 1959, 
57, 37-42. 


Stanley, J. C. Jr. (1950). The differential effects of partial and
continuous reward upon the acquisition and elimination of a
runway response in a two choice situation. In R. R. Bush and
F. Mosteller, Stochastic models for learning. New York: J. Wiley
& Sons, 1955.


Stretch, R. G. A., McGonigle, B. & Rodger, R. S. Serial position
reversal learning in the rat: a preliminary analysis of
training criteria. J. comp. physiol. Psychol., 1963, 56,
719-722.


Uhl, C. N. Two-choice probability learning in the rat as a function
of incentive, probability of reinforcement, and training procedure.
J. exp. Psychol., 1963, 66, 443-449.

von Neumann, J. & Morgenstern, O. Theory of games and economic
behavior. Princeton: Princeton University Press, 1949
(2nd ed.).


Wilson, W. A. Two-choice behavior of monkeys. J. exp. Psychol., 
1960, 59, 207-208. 


Wilson, W. A. & Rollin, A. R. Two-choice behavior of rhesus monkeys 
in a noncontingent situation. J. exp. Psychol., 1959, 58, 174-180. 


Williams, S. B. Reversal learning after two degrees of training.
J. comp. physiol. Psychol., 1942, 353-360.


Winer, B. J. Statistical principles in experimental design. New York: 
McGraw-Hill, 1962. 


Witte, R. S. Conditional response probability in a T-maze. J. exp.
Psychol., 1961, 62, 439-447.




APPENDIX A 


Individual A1 responses for the 1.00 group
for the 30 probability-reversals

[Table of A1 responses by day: values illegible in the source scan]

APPENDIX A (continued)


Individual A1 responses for the .90 group
for the 30 probability-reversals

[Table of A1 responses by day: values illegible in the source scan]

APPENDIX A (continued) 


Individual A1 responses for the .80 group
for the 30 probability-reversals

[Table of A1 responses by day: values illegible in the source scan]


APPENDIX A (continued) 


Individual A1 responses for the .70 group
for the 30 probability-reversals

[Table of A1 responses by day: values illegible in the source scan]


APPENDIX A (continued) 


Individual A1 responses for the .60 group
for the 30 probability-reversals

[Table of A1 responses by day: values illegible in the source scan]


